{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Maxpooling Layer\n",
    "\n",
    "In this notebook, we add and visualize the output of a maxpooling layer in a CNN. \n",
    "\n",
    "A convolutional layer + activation function, followed by a pooling layer, and a linear layer (to create a desired output size) make up the basic layers of a CNN.\n",
    "\n",
    "<img src='notebook_ims/CNN_all_layers.png' height=50% width=50% />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import the image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import cv2\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "# TODO: Feel free to try out your own images here by changing img_path\n",
    "# to a file path to another image on your computer!\n",
    "img_path = 'data/udacity_sdc.png'\n",
    "\n",
    "# load color image \n",
    "bgr_img = cv2.imread(img_path)\n",
    "# convert to grayscale\n",
    "gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)\n",
    "\n",
    "# normalize, rescale entries to lie in [0,1]\n",
    "gray_img = gray_img.astype(\"float32\")/255\n",
    "\n",
    "# plot image\n",
    "plt.imshow(gray_img, cmap='gray')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define and visualize the filters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "## TODO: Feel free to modify the numbers here, to try out another filter!\n",
    "filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])\n",
    "\n",
    "print('Filter shape: ', filter_vals.shape)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Defining four different filters, \n",
    "# all of which are linear combinations of the `filter_vals` defined above\n",
    "\n",
    "# define four filters\n",
    "filter_1 = filter_vals\n",
    "filter_2 = -filter_1\n",
    "filter_3 = filter_1.T\n",
    "filter_4 = -filter_3\n",
    "filters = np.array([filter_1, filter_2, filter_3, filter_4])\n",
    "\n",
    "# For an example, print out the values of filter 1\n",
    "print('Filter 1: \\n', filter_1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define convolutional and pooling layers\n",
    "\n",
    "You've seen how to define a convolutional layer, next is a:\n",
    "* Pooling layer\n",
    "\n",
    "In the next cell, we initialize a convolutional layer so that it contains all the created filters. Then add a maxpooling layer, [documented here](http://pytorch.org/docs/stable/_modules/torch/nn/modules/pooling.html), with a kernel size of (2x2) so you can see that the image resolution has been reduced after this step!\n",
    "\n",
    "A maxpooling layer reduces the x-y size of an input and only keeps the most *active* pixel values. Below is an example of a 2x2 pooling kernel, with a stride of 2, applied to a small patch of grayscale pixel values; reducing the x-y size of the patch by a factor of 2. Only the maximum pixel values in 2x2 remain in the new, pooled output.\n",
    "\n",
    "<img src='notebook_ims/maxpooling_ex.png' height=50% width=50% />"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "    \n",
    "# define a neural network with a convolutional layer with four filters\n",
    "# AND a pooling layer of size (2, 2)\n",
    "class Net(nn.Module):\n",
    "    \n",
    "    def __init__(self, weight):\n",
    "        super(Net, self).__init__()\n",
    "        # initializes the weights of the convolutional layer to be the weights of the 4 defined filters\n",
    "        k_height, k_width = weight.shape[2:]\n",
    "        # assumes there are 4 grayscale filters\n",
    "        self.conv = nn.Conv2d(1, 4, kernel_size=(k_height, k_width), bias=False)\n",
    "        self.conv.weight = torch.nn.Parameter(weight)\n",
    "        # define a pooling layer\n",
    "        self.pool = nn.MaxPool2d(2, 2)\n",
    "\n",
    "    def forward(self, x):\n",
    "        # calculates the output of a convolutional layer\n",
    "        # pre- and post-activation\n",
    "        conv_x = self.conv(x)\n",
    "        activated_x = F.relu(conv_x)\n",
    "        \n",
    "        # applies pooling layer\n",
    "        pooled_x = self.pool(activated_x)\n",
    "        \n",
    "        # returns all layers\n",
    "        return conv_x, activated_x, pooled_x\n",
    "    \n",
    "# instantiate the model and set the weights\n",
    "weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)\n",
    "model = Net(weight)\n",
    "\n",
    "# print out the layer in the network\n",
    "print(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualize the output of each filter\n",
    "\n",
    "First, we'll define a helper function, `viz_layer` that takes in a specific layer and number of filters (optional argument), and displays the output of that layer once an image has been passed through."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# helper function for visualizing the output of a given layer\n",
    "# default number of filters is 4\n",
    "def viz_layer(layer, n_filters= 4):\n",
    "    fig = plt.figure(figsize=(20, 20))\n",
    "    \n",
    "    for i in range(n_filters):\n",
    "        ax = fig.add_subplot(1, n_filters, i+1)\n",
    "        # grab layer outputs\n",
    "        ax.imshow(np.squeeze(layer[0,i].data.numpy()), cmap='gray')\n",
    "        ax.set_title('Output %s' % str(i+1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the output of a convolutional layer after a ReLu activation function is applied.\n",
    "\n",
    "#### ReLu activation\n",
    "\n",
    "A ReLu function turns all negative pixel values in 0's (black). See the equation pictured below for input pixel values, `x`. \n",
    "\n",
    "<img src='notebook_ims/relu_ex.png' height=50% width=50% />"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# plot original image\n",
    "plt.imshow(gray_img, cmap='gray')\n",
    "\n",
    "# visualize all filters\n",
    "fig = plt.figure(figsize=(12, 6))\n",
    "fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)\n",
    "for i in range(4):\n",
    "    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])\n",
    "    ax.imshow(filters[i], cmap='gray')\n",
    "    ax.set_title('Filter %s' % str(i+1))\n",
    "\n",
    "    \n",
    "# convert the image into an input Tensor\n",
    "gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)\n",
    "\n",
    "# get all the layers \n",
    "conv_layer, activated_layer, pooled_layer = model(gray_img_tensor)\n",
    "\n",
    "# visualize the output of the activated conv layer\n",
    "viz_layer(activated_layer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualize the output of the pooling layer\n",
    "\n",
    "Then, take a look at the output of a pooling layer. The pooling layer takes as input the feature maps pictured above and reduces the dimensionality of those maps, by some pooling factor, by constructing a new, smaller image of only the maximum (brightest) values in a given kernel area.\n",
    "\n",
    "Take a look at the values on the x, y axes to see how the image has changed size.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# visualize the output of the pooling layer\n",
    "viz_layer(pooled_layer)"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
